NTCIREVAL: A Generic Toolkit for Information Access Evaluation

Author

  • Tetsuya Sakai
Abstract

Over the past decades, Information Access (IA) tasks have evolved and diversified. For example, in the mid-20th century, Information Retrieval (IR) was about set retrieval for libraries; then, with the advent of the digital information overload era, ranked retrieval became a necessity; now, in the 21st century, we are experiencing richer forms of IR such as diversified Web search, which aims to satisfy ambiguous and underspecified queries [25]. Moreover, with the progress in natural language processing, automatic Question Answering (QA) [10, 15, 20] and leveraging Community QA (CQA) data [24] have become feasible. Many of these IA tasks involve automatic ranking of items (e.g. documents or answer strings). To ensure progress in IA research, reliable evaluation metrics are an absolute necessity. Given an IA task definition, an evaluation metric should be designed so that it can guide the system towards the right goal of that particular task. Hence, together with IA tasks, IA evaluation methods and metrics have also evolved and diversified. This paper introduces a toolkit for evaluating a variety of IA tasks, called NTCIREVAL, designed primarily for tasks that involve ranking of items. NTCIREVAL is available at http://research.nii.ac.jp/ntcir/tools/ntcireval-en.html. (This paper discusses the version released in April 2011.) It works on UNIX/Linux platforms. While NTCIREVAL can handle some of the ongoing and past IA tasks of NTCIR, the sesquiannual IA evaluation workshop run by the National Institute of Informatics, it is a generic toolkit that can be used for other IA tasks. The main objective of this paper is to provide an overview of the philosophy behind NTCIREVAL and of its functionalities, so that IA researchers can quickly understand and utilise it whenever appropriate.
Because IA research relies heavily on experimentation, sharing such an evaluation toolkit among IA researchers should help enhance the reproducibility of experiments, and also foster discussions on how to better evaluate IA tasks. This paper should also serve as a non-comprehensive survey of recent developments in the field of IA evaluation metrics. The remainder of this paper is organised as follows. Section 2 discusses the design philosophy of NTCIREVAL. Section 3 explains how NTCIREVAL can be used for traditional ranked retrieval evaluation and its extensions. Section 4 explains how it can be used for diversified search evaluation. Finally, Section 5 summarises this paper and provides some general recommendations for IA researchers.

Similar resources

A MATLAB® Toolkit for Spatial and Temporal Analysis of Network Traffic Anomalies and a Simulator/Emulator for Network Traffic Anomalies

An easily customizable toolkit used to reveal spatial and temporal properties of network traffic traces and a simulator/emulator that regenerates anomalies having statistically similar anomalies to real networks is developed. The analyzer toolkit is fed with network traces as inputs, and anomalies are identified along with their properties. The toolkit uses Fourier analysis to suppress prominen...

Evaluation of Dental Students’ Access, Knowledge, and Usage Regarding Information Technology in Dentistry

Background and Aims: Information technology (IT) can make a powerful contribution to dental education and practice. The aim of the present study was to determine access, knowledge and usage of IT among dental students of Islamic Azad University of Esfahan in 2015. Materials and Methods: We conducted a cross-sectional study using a stratified random sampling method in 2016. A validat...

Evaluation of computer knowledge and information literacy among dental students of school of dentistry, Tehran medical sciences, Islamic Azad university in 2016

Background and Aims: Use of computer and internet in our modern and digitalized society is an efficient tool to be up to date and fulfil the information gaps. The aim of this study was to evaluate the extent of computer and internet usage among dental students of Islamic Azad University, Tehran, in 2016. Materials and Methods: This descriptive study for evaluating access and usage of computer ...

Evaluation of the Routs of Oral Health Information Being Delivered to Yazd Population in 2011

Introduction: The discipline of oral public health is known as a science and art of dealing with population oral health. In order to improve public awareness followed by changing people's lifestyle, it is necessary for the society to be exposed to massive oral health information. The aim of this study was to evaluate the routes of oral health information being delivered to the Yazd population in1...

Semantic Annotation, Analysis and Comparison: A Multilingual and Cross-lingual Text Analytics Toolkit

Within the context of globalization, multilinguality and cross-linguality for information access have emerged as issues of major interest. In order to achieve the goal that users from all countries have access to the same information, there is an impending need for systems that can help in overcoming language barriers by facilitating multilingual and cross-lingual access to data. In this paper,...

Publication date: 2011